Dataset statistics
| Number of variables | 16 |
|---|---|
| Number of observations | 135617 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 16.6 MiB |
| Average record size in memory | 128.0 B |
Variable types
| NUM | 11 |
|---|---|
| BOOL | 5 |
Reproduction
| Analysis started | 2021-05-24 04:29:36.294771 |
|---|---|
| Analysis finished | 2021-05-24 04:30:29.121268 |
| Duration | 52.83 seconds |
| Version | pandas-profiling v2.8.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
Warnings
| Warning | Type |
|---|---|
| add_group is highly correlated with add_friend | High correlation |
| add_friend is highly correlated with add_group | High correlation |
| finish_num is highly correlated with learn_num | High correlation |
| learn_num is highly correlated with finish_num | High correlation |
| coupon is highly skewed (γ1 = 58.57520335) | Skewed |
| user_id has unique values | Unique |
| login_diff_time has 12476 (9.2%) zeros | Zeros |
| distance_day has 1438 (1.1%) zeros | Zeros |
| login_time has 7932 (5.8%) zeros | Zeros |
| launch_time has 87751 (64.7%) zeros | Zeros |
| camp_num has 9741 (7.2%) zeros | Zeros |
| learn_num has 27334 (20.2%) zeros | Zeros |
| finish_num has 45739 (33.7%) zeros | Zeros |
| coupon has 122104 (90.0%) zeros | Zeros |
| course_order_num has 127009 (93.7%) zeros | Zeros |
user_id
Real number (ℝ)
| Distinct count | 135617 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2000002535200329.5 |
|---|---|
| Minimum | 2000001555945280 |
| Maximum | 2000002948014779 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.0 MiB |
Quantile statistics
| Minimum | 2.000001556e+15 |
|---|---|
| 5-th percentile | 2.000002282e+15 |
| Q1 | 2.000002415e+15 |
| median | 2.000002499e+15 |
| Q3 | 2.000002718e+15 |
| 95-th percentile | 2.000002919e+15 |
| Maximum | 2.000002948e+15 |
| Range | 1392069499 |
| Interquartile range (IQR) | 303046960 |
Descriptive statistics
| Standard deviation | 249996404 |
|---|---|
| Coefficient of variation (CV) | 1.249980436e-07 |
| Kurtosis | 3.719889271 |
| Mean | 2.000002535e+15 |
| Median Absolute Deviation (MAD) | 114796193 |
| Skewness | -1.247888558 |
| Sum | -5.466817289e+18 |
| Variance | 6.249820203e+16 |
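The negative Sum above is surprising, since every user_id is positive. One plausible explanation, assumed here rather than stated by the report, is that the sum was accumulated in a signed 64-bit integer and wrapped around. The sketch below rebuilds an approximate true sum from the reported Mean and the number of observations, then reduces it to what a signed 64-bit integer would hold. The helper name `wrap_int64` is hypothetical.

```python
def wrap_int64(value):
    """Reduce a Python int to the value a signed 64-bit integer would hold."""
    return (value + 2**63) % 2**64 - 2**63


n_rows = 135617                    # Number of observations
twice_mean = 4000005070400659      # 2 * 2000002535200329.5, kept as an exact integer

# Approximate true sum: mean * n (the reported mean is itself rounded).
approx_true_sum = n_rows * twice_mean // 2

# What a signed 64-bit accumulator would end up holding.
wrapped = wrap_int64(approx_true_sum)

print(f"approximate true sum: {approx_true_sum:.9e}")
print(f"after int64 wraparound: {wrapped:.9e}")
```

The wrapped value agrees with the reported Sum to the digits shown, which is consistent with an overflow. Under that assumption, the Sum row for user_id should not be read as a meaningful total.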
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 2.000002763e+15 | 1 | < 0.1% | |
| 2.000002458e+15 | 1 | < 0.1% | |
| 2.000002739e+15 | 1 | < 0.1% | |
| 2.000002421e+15 | 1 | < 0.1% | |
| 2.000001659e+15 | 1 | < 0.1% | |
| 2.00000244e+15 | 1 | < 0.1% | |
| 2.000002356e+15 | 1 | < 0.1% | |
| 2.000002829e+15 | 1 | < 0.1% | |
| 2.000002421e+15 | 1 | < 0.1% | |
| 2.000002753e+15 | 1 | < 0.1% | |
| Other values (135607) | 135607 | > 99.9% |
| Value | Count | Frequency (%) | |
| 2.000001556e+15 | 1 | < 0.1% | |
| 2.000001557e+15 | 1 | < 0.1% | |
| 2.000001558e+15 | 1 | < 0.1% | |
| 2.000001558e+15 | 1 | < 0.1% | |
| 2.000001558e+15 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 2.000002948e+15 | 1 | < 0.1% | |
| 2.000002947e+15 | 1 | < 0.1% | |
| 2.000002947e+15 | 1 | < 0.1% | |
| 2.000002947e+15 | 1 | < 0.1% | |
| 2.000002947e+15 | 1 | < 0.1% |
login_day
Real number (ℝ)
| Distinct count | 50 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.183258735999174 |
|---|---|
| Minimum | -1 |
| Maximum | 108 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.0 MiB |
Quantile statistics
| Minimum | -1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 4 |
| Q3 | 6 |
| 95-th percentile | 8 |
| Maximum | 108 |
| Range | 109 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 2.363427555 |
|---|---|
| Coefficient of variation (CV) | 0.5649728368 |
| Kurtosis | 125.8400782 |
| Mean | 4.183258736 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 3.611565193 |
| Sum | 567321 |
| Variance | 5.585789808 |
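Several rows of the descriptive statistics are tied together by simple identities: the mean is Sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, and the variance is the standard deviation squared. The following sketch recomputes them for login_day from the figures reported above. The function name `summarize` is hypothetical, and the identities are assumed to match how the report derived its values.

```python
def summarize(n, total, std):
    """Derive mean, CV and variance from count, sum and standard deviation."""
    mean = total / n
    return {
        "mean": mean,              # Sum / number of observations
        "cv": std / mean,          # coefficient of variation
        "variance": std ** 2,      # variance is the squared standard deviation
    }


# Figures copied from the login_day tables in this report.
login_day = summarize(n=135617, total=567321, std=2.363427555)

for name, value in login_day.items():
    print(f"{name}: {value:.9f}")
```

The recomputed values agree with the reported Mean, Coefficient of variation (CV) and Variance up to rounding of the inputs.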
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 5 | 21229 | 15.7% | |
| 4 | 20035 | 14.8% | |
| 6 | 19865 | 14.6% | |
| 3 | 18938 | 14.0% | |
| 2 | 17069 | 12.6% | |
| 7 | 14805 | 10.9% | |
| 1 | 12476 | 9.2% | |
| 8 | 6212 | 4.6% | |
| -1 | 4200 | 3.1% | |
| 9 | 481 | 0.4% | |
| Other values (40) | 307 | 0.2% |
| Value | Count | Frequency (%) | |
| -1 | 4200 | 3.1% | |
| 1 | 12476 | 9.2% | |
| 2 | 17069 | 12.6% | |
| 3 | 18938 | 14.0% | |
| 4 | 20035 | 14.8% |
| Value | Count | Frequency (%) | |
| 108 | 1 | < 0.1% | |
| 102 | 1 | < 0.1% | |
| 101 | 1 | < 0.1% | |
| 91 | 1 | < 0.1% | |
| 84 | 1 | < 0.1% |
| Distinct count | 662 |
|---|---|
| Unique (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.0862624892159536 |
|---|---|
| Minimum | -1.0 |
| Maximum | 135.0 |
| Zeros | 12476 |
| Zeros (%) | 9.2% |
| Memory size | 1.0 MiB |
Quantile statistics
| Minimum | -1 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.75 |
| median | 1 |
| Q3 | 1.2 |
| 95-th percentile | 2 |
| Maximum | 135 |
| Range | 136 |
| Interquartile range (IQR) | 0.45 |
Descriptive statistics
| Standard deviation | 1.933017576 |
|---|---|
| Coefficient of variation (CV) | 1.779512407 |
| Kurtosis | 524.840013 |
| Mean | 1.086262489 |
| Median Absolute Deviation (MAD) | 0.25 |
| Skewness | 17.49091975 |
| Sum | 147315.66 |
| Variance | 3.736556951 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 1 | 29823 | 22.0% | |
| 0 | 12476 | 9.2% | |
| 0.5 | 7604 | 5.6% | |
| 0.83 | 6966 | 5.1% | |
| 0.8 | 6863 | 5.1% | |
| 0.86 | 6687 | 4.9% | |
| 1.17 | 6400 | 4.7% | |
| 0.75 | 6154 | 4.5% | |
| 0.67 | 6111 | 4.5% | |
| 0.88 | 5625 | 4.1% | |
| Other values (652) | 40908 | 30.2% |
| Value | Count | Frequency (%) | |
| -1 | 4200 | 3.1% | |
| 0 | 12476 | 9.2% | |
| 0.5 | 7604 | 5.6% | |
| 0.67 | 6111 | 4.5% | |
| 0.75 | 6154 | 4.5% |
| Value | Count | Frequency (%) | |
| 135 | 1 | < 0.1% | |
| 85.33 | 1 | < 0.1% | |
| 78 | 1 | < 0.1% | |
| 74.8 | 1 | < 0.1% | |
| 71.2 | 1 | < 0.1% |
| Distinct count | 353 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 136.36451919744576 |
|---|---|
| Minimum | -1275 |
| Maximum | 6588 |
| Zeros | 1438 |
| Zeros (%) | 1.1% |
| Memory size | 1.0 MiB |
Quantile statistics
| Minimum | -1275 |
|---|---|
| 5-th percentile | 11 |
| Q1 | 38 |
| median | 84 |
| Q3 | 180 |
| 95-th percentile | 377 |
| Maximum | 6588 |
| Range | 7863 |
| Interquartile range (IQR) | 142 |
Descriptive statistics
| Standard deviation | 135.5882317 |
|---|---|
| Coefficient of variation (CV) | 0.9943072622 |
| Kurtosis | 85.20759251 |
| Mean | 136.3645192 |
| Median Absolute Deviation (MAD) | 57 |
| Skewness | 3.004213569 |
| Sum | 18493347 |
| Variance | 18384.16859 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| -1 | 4228 | 3.1% | |
| 21 | 2807 | 2.1% | |
| 20 | 2599 | 1.9% | |
| 374 | 2491 | 1.8% | |
| 22 | 2284 | 1.7% | |
| 373 | 2181 | 1.6% | |
| 379 | 2122 | 1.6% | |
| 43 | 2007 | 1.5% | |
| 375 | 1999 | 1.5% | |
| 44 | 1995 | 1.5% | |
| Other values (343) | 110904 | 81.8% |
| Value | Count | Frequency (%) | |
| -1275 | 1 | < 0.1% | |
| -981 | 1 | < 0.1% | |
| -23 | 1 | < 0.1% | |
| -15 | 1 | < 0.1% | |
| -14 | 3 | < 0.1% |
| Value | Count | Frequency (%) | |
| 6588 | 1 | < 0.1% | |
| 6482 | 1 | < 0.1% | |
| 4393 | 1 | < 0.1% | |
| 3006 | 1 | < 0.1% | |
| 2665 | 1 | < 0.1% |
| Distinct count | 644 |
|---|---|
| Unique (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 38.096684044035776 |
|---|---|
| Minimum | 0 |
| Maximum | 1480 |
| Zeros | 7932 |
| Zeros (%) | 5.8% |
| Memory size | 1.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 7 |
| median | 21 |
| Q3 | 44 |
| 95-th percentile | 140 |
| Maximum | 1480 |
| Range | 1480 |
| Interquartile range (IQR) | 37 |
Descriptive statistics
| Standard deviation | 57.63938892 |
|---|---|
| Coefficient of variation (CV) | 1.512976532 |
| Kurtosis | 37.29703934 |
| Mean | 38.09668404 |
| Median Absolute Deviation (MAD) | 17 |
| Skewness | 4.58075032 |
| Sum | 5166558 |
| Variance | 3322.299156 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 1 | 8539 | 6.3% | |
| 0 | 7932 | 5.8% | |
| 2 | 5387 | 4.0% | |
| 3 | 3412 | 2.5% | |
| 4 | 2653 | 2.0% | |
| 14 | 2645 | 2.0% | |
| 16 | 2561 | 1.9% | |
| 15 | 2556 | 1.9% | |
| 13 | 2548 | 1.9% | |
| 12 | 2535 | 1.9% | |
| Other values (634) | 94849 | 69.9% |
| Value | Count | Frequency (%) | |
| 0 | 7932 | 5.8% | |
| 1 | 8539 | 6.3% | |
| 2 | 5387 | 4.0% | |
| 3 | 3412 | 2.5% | |
| 4 | 2653 | 2.0% |
| Value | Count | Frequency (%) | |
| 1480 | 1 | < 0.1% | |
| 1339 | 1 | < 0.1% | |
| 1166 | 1 | < 0.1% | |
| 1156 | 1 | < 0.1% | |
| 1107 | 1 | < 0.1% |
| Distinct count | 23 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.5111158630555166 |
|---|---|
| Minimum | 0 |
| Maximum | 76 |
| Zeros | 87751 |
| Zeros (%) | 64.7% |
| Memory size | 1.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1 |
| 95-th percentile | 2 |
| Maximum | 76 |
| Range | 76 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.8905220551 |
|---|---|
| Coefficient of variation (CV) | 1.742309561 |
| Kurtosis | 485.042345 |
| Mean | 0.5111158631 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 8.761561022 |
| Sum | 69316 |
| Variance | 0.7930295307 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 0 | 87751 | 64.7% | |
| 1 | 32994 | 24.3% | |
| 2 | 10451 | 7.7% | |
| 3 | 3145 | 2.3% | |
| 4 | 902 | 0.7% | |
| 5 | 247 | 0.2% | |
| 6 | 72 | 0.1% | |
| 7 | 23 | < 0.1% | |
| 8 | 9 | < 0.1% | |
| 16 | 4 | < 0.1% | |
| Other values (13) | 19 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 87751 | 64.7% | |
| 1 | 32994 | 24.3% | |
| 2 | 10451 | 7.7% | |
| 3 | 3145 | 2.3% | |
| 4 | 902 | 0.7% |
| Value | Count | Frequency (%) | |
| 76 | 1 | < 0.1% | |
| 47 | 1 | < 0.1% | |
| 37 | 1 | < 0.1% | |
| 27 | 2 | < 0.1% | |
| 25 | 1 | < 0.1% |
chinese_subscribe_num
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.0 MiB |
| Value | Count | Frequency (%) | |
| 0 | 94045 | 69.3% | |
| 1 | 41572 | 30.7% |
math_subscribe_num
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.0 MiB |
| Value | Count | Frequency (%) | |
| 0 | 125626 | 92.6% | |
| 1 | 9991 | 7.4% |
add_friend
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.0 MiB |
| Value | Count | Frequency (%) | |
| 1 | 135099 | 99.6% | |
| 0 | 518 | 0.4% |
add_group
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.0 MiB |
| Value | Count | Frequency (%) | |
| 1 | 135099 | 99.6% | |
| 0 | 518 | 0.4% |
camp_num
Real number (ℝ)
| Distinct count | 7 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.6076008170067175 |
|---|---|
| Minimum | 0 |
| Maximum | 6 |
| Zeros | 9741 |
| Zeros (%) | 7.2% |
| Memory size | 1.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 3 |
| Maximum | 6 |
| Range | 6 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.9602472176 |
|---|---|
| Coefficient of variation (CV) | 0.5973169505 |
| Kurtosis | 1.246628902 |
| Mean | 1.607600817 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.9132631033 |
| Sum | 218018 |
| Variance | 0.9220747189 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 1 | 61061 | 45.0% | |
| 2 | 45114 | 33.3% | |
| 3 | 13703 | 10.1% | |
| 0 | 9741 | 7.2% | |
| 4 | 4405 | 3.2% | |
| 5 | 1558 | 1.1% | |
| 6 | 35 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 9741 | 7.2% | |
| 1 | 61061 | 45.0% | |
| 2 | 45114 | 33.3% | |
| 3 | 13703 | 10.1% | |
| 4 | 4405 | 3.2% |
| Value | Count | Frequency (%) | |
| 6 | 35 | < 0.1% | |
| 5 | 1558 | 1.1% | |
| 4 | 4405 | 3.2% | |
| 3 | 13703 | 10.1% | |
| 2 | 45114 | 33.3% |
| Distinct count | 26 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.3129548655404557 |
|---|---|
| Minimum | 0 |
| Maximum | 25 |
| Zeros | 27334 |
| Zeros (%) | 20.2% |
| Memory size | 1.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 3 |
| Q3 | 5 |
| 95-th percentile | 9 |
| Maximum | 25 |
| Range | 25 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 2.966820872 |
|---|---|
| Coefficient of variation (CV) | 0.8955210657 |
| Kurtosis | 1.730926293 |
| Mean | 3.312954866 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 1.129273811 |
| Sum | 449293 |
| Variance | 8.802026085 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 0 | 27334 | 20.2% | |
| 1 | 18463 | 13.6% | |
| 4 | 16550 | 12.2% | |
| 5 | 16522 | 12.2% | |
| 3 | 16041 | 11.8% | |
| 2 | 15979 | 11.8% | |
| 6 | 6903 | 5.1% | |
| 7 | 4897 | 3.6% | |
| 8 | 4155 | 3.1% | |
| 9 | 3121 | 2.3% | |
| Other values (16) | 5652 | 4.2% |
| Value | Count | Frequency (%) | |
| 0 | 27334 | 20.2% | |
| 1 | 18463 | 13.6% | |
| 2 | 15979 | 11.8% | |
| 3 | 16041 | 11.8% | |
| 4 | 16550 | 12.2% |
| Value | Count | Frequency (%) | |
| 25 | 1 | < 0.1% | |
| 24 | 2 | < 0.1% | |
| 23 | 5 | < 0.1% | |
| 22 | 7 | < 0.1% | |
| 21 | 7 | < 0.1% |
| Distinct count | 26 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.692140365883333 |
|---|---|
| Minimum | 0 |
| Maximum | 25 |
| Zeros | 45739 |
| Zeros (%) | 33.7% |
| Memory size | 1.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 2 |
| Q3 | 5 |
| 95-th percentile | 8 |
| Maximum | 25 |
| Range | 25 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 2.886858906 |
|---|---|
| Coefficient of variation (CV) | 1.072328524 |
| Kurtosis | 2.110070635 |
| Mean | 2.692140366 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 1.273291544 |
| Sum | 365100 |
| Variance | 8.333954345 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 0 | 45739 | 33.7% | |
| 5 | 16645 | 12.3% | |
| 1 | 15446 | 11.4% | |
| 3 | 13565 | 10.0% | |
| 2 | 13277 | 9.8% | |
| 4 | 12637 | 9.3% | |
| 6 | 5512 | 4.1% | |
| 7 | 3464 | 2.6% | |
| 8 | 3094 | 2.3% | |
| 10 | 2200 | 1.6% | |
| Other values (16) | 4038 | 3.0% |
| Value | Count | Frequency (%) | |
| 0 | 45739 | 33.7% | |
| 1 | 15446 | 11.4% | |
| 2 | 13277 | 9.8% | |
| 3 | 13565 | 10.0% | |
| 4 | 12637 | 9.3% |
| Value | Count | Frequency (%) | |
| 25 | 1 | < 0.1% | |
| 24 | 1 | < 0.1% | |
| 23 | 6 | < 0.1% | |
| 22 | 5 | < 0.1% | |
| 21 | 7 | < 0.1% |
study_num
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.0 MiB |
| Value | Count | Frequency (%) | |
| 0 | 112610 | 83.0% | |
| 1 | 23007 | 17.0% |
coupon
Real number (ℝ)
| Distinct count | 33 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.15896974568085123 |
|---|---|
| Minimum | 0 |
| Maximum | 112 |
| Zeros | 122104 |
| Zeros (%) | 90.0% |
| Memory size | 1.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 112 |
| Range | 112 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.9146963991 |
|---|---|
| Coefficient of variation (CV) | 5.753902387 |
| Kurtosis | 6206.353908 |
| Mean | 0.1589697457 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 58.57520335 |
| Sum | 21559 |
| Variance | 0.8366695025 |
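The very large Skewness above comes from a distribution that is almost entirely zeros with a few large values. The sketch below shows how a skewness coefficient can be computed from the second and third central moments. It is an illustration on a small made-up sample, not a recomputation of the coupon column. The function name `skewness` and the `bias_corrected` flag are hypothetical, and the report does not state which estimator it uses (the bias-corrected form is assumed to be the more likely one).

```python
import math


def skewness(values, bias_corrected=True):
    """Sample skewness from the second and third central moments."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 values")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined for a constant sample")
    g1 = m3 / m2 ** 1.5                             # moment coefficient of skewness
    if not bias_corrected:
        return g1
    # Adjusted Fisher-Pearson coefficient (small-sample correction).
    return g1 * math.sqrt(n * (n - 1)) / (n - 2)


# Made-up zero-inflated sample: mostly zeros plus one large value.
zero_inflated = [0] * 9 + [10]
symmetric = [1, 2, 3, 4, 5]

print(skewness(zero_inflated, bias_corrected=False))
print(skewness(zero_inflated))
print(skewness(symmetric))
```

A long right tail pushes the coefficient far above zero, while a symmetric sample gives a value of zero.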
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 0 | 122104 | 90.0% | |
| 1 | 9563 | 7.1% | |
| 2 | 2437 | 1.8% | |
| 3 | 788 | 0.6% | |
| 4 | 340 | 0.3% | |
| 5 | 166 | 0.1% | |
| 6 | 78 | 0.1% | |
| 7 | 39 | < 0.1% | |
| 9 | 22 | < 0.1% | |
| 8 | 21 | < 0.1% | |
| Other values (23) | 59 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 122104 | 90.0% | |
| 1 | 9563 | 7.1% | |
| 2 | 2437 | 1.8% | |
| 3 | 788 | 0.6% | |
| 4 | 340 | 0.3% |
| Value | Count | Frequency (%) | |
| 112 | 2 | < 0.1% | |
| 108 | 1 | < 0.1% | |
| 102 | 1 | < 0.1% | |
| 54 | 1 | < 0.1% | |
| 51 | 1 | < 0.1% |
| Distinct count | 22 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.10562097672120752 |
|---|---|
| Minimum | 0 |
| Maximum | 24 |
| Zeros | 127009 |
| Zeros (%) | 93.7% |
| Memory size | 1.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 24 |
| Range | 24 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.5528032321 |
|---|---|
| Coefficient of variation (CV) | 5.233839426 |
| Kurtosis | 237.5985816 |
| Mean | 0.1056209767 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 11.54324799 |
| Sum | 14324 |
| Variance | 0.3055914135 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 0 | 127009 | 93.7% | |
| 1 | 5816 | 4.3% | |
| 2 | 1578 | 1.2% | |
| 3 | 580 | 0.4% | |
| 4 | 289 | 0.2% | |
| 5 | 136 | 0.1% | |
| 6 | 82 | 0.1% | |
| 7 | 33 | < 0.1% | |
| 8 | 28 | < 0.1% | |
| 9 | 13 | < 0.1% | |
| Other values (12) | 53 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 127009 | 93.7% | |
| 1 | 5816 | 4.3% | |
| 2 | 1578 | 1.2% | |
| 3 | 580 | 0.4% | |
| 4 | 289 | 0.2% |
| Value | Count | Frequency (%) | |
| 24 | 1 | < 0.1% | |
| 23 | 1 | < 0.1% | |
| 20 | 1 | < 0.1% | |
| 19 | 1 | < 0.1% | |
| 18 | 2 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
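As a minimal sketch of that calculation, the snippet below computes r in plain Python for learn_num and finish_num, using only the ten values listed under "First rows" in this report. The function name `pearson_r` is hypothetical, and ten rows are far too few to reproduce the correlation the report measured on the full dataset.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so plain sums of deviations are enough.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# learn_num and finish_num as they appear in the ten "First rows" of this report.
learn_num = [0, 1, 0, 5, 0, 1, 0, 0, 1, 6]
finish_num = [0, 0, 0, 5, 0, 1, 0, 0, 1, 6]

r = pearson_r(learn_num, finish_num)
# Shifting and rescaling one variable leaves r unchanged.
r_rescaled = pearson_r([2 * x + 3 for x in learn_num], finish_num)
print(round(r, 3), round(r_rescaled, 3))
```

On these ten rows r comes out at roughly 0.99, and the rescaled variant gives the same value, which illustrates the invariance under changes in location and scale.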
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
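A minimal sketch of that recipe in plain Python: convert each variable to ranks, then apply the Pearson formula to the ranks. Tied values receive the average of the ranks they span, which is a common convention and an assumption here. The function names and the sample data are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(xs, ys):
    """Pearson's r applied to the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# A monotonic but nonlinear relationship: y = x cubed.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

For this sample ρ reaches 1 because the ordering is preserved perfectly, while Pearson's r stays below 1 because the relationship is not linear.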
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
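The pair-counting definition can be sketched directly in plain Python. This is the simple form that divides by the total number of pairs (often called τ-a), written as a quadratic-time loop for clarity. Tied pairs count as neither concordant nor discordant here, whereas library implementations often use a tie-adjusted variant, so results may differ on data with many ties. The function name `kendall_tau_a` is hypothetical.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        # The sign of this product says whether x and y move in the same direction.
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 means a tie in x or y: counted as neither.
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


print(kendall_tau_a([1, 2, 3, 4], [1, 2, 3, 4]))   # identical ordering
print(kendall_tau_a([1, 2, 3, 4], [4, 3, 2, 1]))   # reversed ordering
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))         # one swapped pair out of three
```

In the last call, two pairs are concordant and one is discordant, so τ is (2 - 1) / 3.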
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available in the Phik project documentation.
First rows
| | user_id | login_day | login_diff_time | distance_day | login_time | launch_time | chinese_subscribe_num | math_subscribe_num | add_friend | add_group | camp_num | learn_num | finish_num | study_num | coupon | course_order_num |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2000001555945280 | 7 | 6.86 | 131 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 4 |
| 1 | 2000001556645228 | 4 | 1.00 | 81 | 3 | 1 | 1 | 1 | 1 | 1 | 2 | 1 | 0 | 0 | 0 | 0 |
| 2 | 2000001558047804 | 1 | 0.00 | 179 | 3 | 0 | 1 | 0 | 1 | 1 | 2 | 0 | 0 | 0 | 0 | 0 |
| 3 | 2000001558146467 | 6 | 1.00 | 32 | 24 | 3 | 0 | 0 | 1 | 1 | 1 | 5 | 5 | 0 | 0 | 1 |
| 4 | 2000001558146878 | 4 | 1.75 | 361 | 39 | 0 | 0 | 1 | 1 | 1 | 2 | 0 | 0 | 1 | 0 | 0 |
| 5 | 2000001558147371 | 4 | 1.25 | 46 | 68 | 2 | 1 | 0 | 1 | 1 | 2 | 1 | 1 | 1 | 0 | 0 |
| 6 | 2000001559045233 | 1 | 0.00 | 182 | 8 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
| 7 | 2000001559245920 | 4 | 1.50 | 36 | 38 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
| 8 | 2000001559246037 | 4 | 1.75 | 368 | 6 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 |
| 9 | 2000001559247825 | 4 | 1.75 | 22 | 31 | 0 | 0 | 0 | 1 | 1 | 2 | 6 | 6 | 0 | 0 | 0 |
Last rows
| | user_id | login_day | login_diff_time | distance_day | login_time | launch_time | chinese_subscribe_num | math_subscribe_num | add_friend | add_group | camp_num | learn_num | finish_num | study_num | coupon | course_order_num |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 135607 | 2000002947316296 | 1 | 0.0 | 0 | 201 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
| 135608 | 2000002947316932 | 1 | 0.0 | 0 | 2 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
| 135609 | 2000002947317071 | 1 | 0.0 | 0 | 2 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
| 135610 | 2000002947317483 | 1 | 0.0 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
| 135611 | 2000002947317621 | 1 | 0.0 | 0 | 7 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 |
| 135612 | 2000002947317726 | 1 | 0.0 | 0 | 2 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
| 135613 | 2000002947317758 | 1 | 0.0 | 0 | 2 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
| 135614 | 2000002947317827 | -1 | -1.0 | -1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
| 135615 | 2000002947317941 | 1 | 0.0 | 0 | 393 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
| 135616 | 2000002948014779 | 1 | 0.0 | 0 | 4 | 0 | 0 | 0 | 1 | 1 | 2 | 0 | 0 | 0 | 0 | 0 |